Question Analysis Report

Generated: 2025-07-04T10:39:33.683716

Executive Summary

Dataset Size: 9,098 observations
Features: 501 total
Models Analyzed: 7 outcomes
Best R²: 0.143

Model Performance Summary

Outcome | Intercept | R² | Adj. R² | F-statistic | F p-value | AIC | BIC | RMSE | N Significant Features | High VIF Features | Mean VIF | Max VIF | Sample Size
news_proportion_left_leaning | 18.5308*** | 0.1208 | 0.1153 | 21.80 | 0.0000 | 88075.0 | 88487.7 | 30.5155 | 22 | 0 | 1.57 | 3.82 | 9,098
news_proportion_right_leaning | 2.4627*** | 0.0627 | 0.0568 | 10.60 | 0.0000 | 68033.2 | 68445.9 | 10.1431 | 21 | 0 | 1.57 | 3.82 | 9,098
news_proportion_center_leaning | 78.8481*** | 0.1428 | 0.1374 | 26.41 | 0.0000 | 88686.6 | 89099.3 | 31.5586 | 23 | 0 | 1.57 | 3.82 | 9,098
news_proportion_unknown_leaning | 0.1584 | 0.0163 | 0.0101 | 2.63 | 0.0000 | 58252.2 | 58665.0 | 5.9255 | 10 | 0 | 1.57 | 3.82 | 9,098
news_proportion_high_quality | 69.9295*** | 0.1264 | 0.1209 | 22.95 | 0.0000 | 90273.8 | 90686.5 | 34.4349 | 31 | 0 | 1.57 | 3.82 | 9,098
news_proportion_low_quality | 5.6895*** | 0.0473 | 0.0413 | 7.88 | 0.0000 | 76393.1 | 76805.8 | 16.0584 | 15 | 0 | 1.57 | 3.82 | 9,098
news_proportion_unknown_quality | 24.3810*** | 0.1378 | 0.1323 | 25.34 | 0.0000 | 89027.3 | 89440.0 | 32.1550 | 30 | 0 | 1.57 | 3.82 | 9,098
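
The fit statistics in each row are tied together by the standard OLS identities, so they can be cross-checked from R², the sample size, and the number of predictors. The sketch below does this for the news_proportion_left_leaning row. It assumes the models include an intercept and use k = 57 predictors, the "Total Features" count from Technical Details; that choice is an assumption, although it appears to reproduce the reported values to within rounding. The function names are illustrative and are not taken from the report's code.

```python
import math


def adjusted_r2(r2, n, k):
    """Adjusted R² for an OLS fit with an intercept, n observations, k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F-statistic for the null that all k slope coefficients are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def bic_minus_aic(n, k):
    """BIC minus AIC for the same fit: (k + 1) parameters times (ln n - 2)."""
    return (k + 1) * (math.log(n) - 2.0)


# Values copied from the news_proportion_left_leaning row.
# k = 57 is an assumption (the "Total Features" count in Technical Details).
n, k, r2 = 9098, 57, 0.1208

print(round(adjusted_r2(r2, n, k), 4))  # table reports Adj. R² = 0.1153
print(round(f_statistic(r2, n, k), 2))  # table reports F = 21.80
print(round(bic_minus_aic(n, k), 1))    # table: 88487.7 - 88075.0 = 412.7
```

Small differences from the table are expected because the R² input is itself rounded to four decimals.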

Correlation Matrix

Feature Importance

Regression Coefficients by Outcome

news_proportion_left_leaning (R² = 0.121, 38 features)

news_proportion_right_leaning (R² = 0.063, 38 features)

news_proportion_center_leaning (R² = 0.143, 38 features)

news_proportion_unknown_leaning (R² = 0.016, 38 features)

news_proportion_high_quality (R² = 0.126, 38 features)

news_proportion_low_quality (R² = 0.047, 38 features)

news_proportion_unknown_quality (R² = 0.138, 38 features)

Model Family Comparisons

proportion_left_leaning

proportion_right_leaning

proportion_high_quality

proportion_news

num_citations

Multicollinearity Diagnostics

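Interpretation: Variance Inflation Factor (VIF) measures multicollinearity. For feature j it equals 1 / (1 − R_j²), where R_j² comes from regressing that feature on all the other features, so a VIF of 1 means no collinearity and larger values mean a less precisely estimated coefficient. Values above 5 or 10 are commonly treated as problematic; every model in this report has a maximum VIF of 3.82.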
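
As a minimal sketch of how VIFs can be computed, the code below uses the identity that the VIF of feature j is the j-th diagonal entry of the inverse of the predictors' correlation matrix. It is written in plain Python for illustration only. The helper names and the toy data are hypothetical, and this is not necessarily how the report's pipeline computes its VIF columns.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    k = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vifs(columns):
    """VIF for each predictor: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Toy predictors (made up): x1 and x2 have correlation 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(vifs([x1, x2]))
```

With only two predictors the result reduces to 1 / (1 − r²), which makes the toy case easy to check by hand.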

news_proportion_left_leaning (High VIF: 0, Mean VIF: 1.57)

news_proportion_right_leaning (High VIF: 0, Mean VIF: 1.57)

news_proportion_center_leaning (High VIF: 0, Mean VIF: 1.57)

news_proportion_unknown_leaning (High VIF: 0, Mean VIF: 1.57)

news_proportion_high_quality (High VIF: 0, Mean VIF: 1.57)

news_proportion_low_quality (High VIF: 0, Mean VIF: 1.57)

news_proportion_unknown_quality (High VIF: 0, Mean VIF: 1.57)

Summary Statistics

Variable | Type | Mean | Std | Min | Max | N | Missing
num_citations | Citation Outcome | 5.7652 | 5.1669 | 0.0000 | 46.0000 | 32,400 | 0
proportion_high_quality | Citation Outcome | 8.9662 | 21.3873 | 0.0000 | 100.0000 | 32,400 | 0
proportion_left_leaning | Citation Outcome | 1.6659 | 7.4564 | 0.0000 | 100.0000 | 32,400 | 0
proportion_right_leaning | Citation Outcome | 0.0819 | 1.2404 | 0.0000 | 50.0000 | 32,400 | 0
news_proportion_high_quality | Citation Outcome | 21.8282 | 39.9889 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_left_leaning | Citation Outcome | 4.7865 | 18.8207 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_right_leaning | Citation Outcome | 0.3746 | 5.5664 | 0.0000 | 100.0000 | 32,400 | 0
proportion_news | Citation Outcome | 10.7818 | 23.2094 | 0.0000 | 100.0000 | 32,400 | 0
turn_number | Question/Response Feature | 1.7057 | 2.0636 | 1.0000 | 39.0000 | 32,400 | 0
total_turns | Question/Response Feature | 2.5335 | 3.5807 | 1.0000 | 50.0000 | 32,400 | 0
question_length_chars_log | Question/Response Feature | -0.0000 | 1.0000 | -3.8234 | 2.6358 | 32,400 | 0
question_length_words_log | Question/Response Feature | 0.0000 | 1.0000 | -2.2585 | 2.9189 | 32,400 | 0
response_length_log | Question/Response Feature | -0.0000 | 1.0000 | -7.0885 | 3.1220 | 32,400 | 0
response_word_count_log | Question/Response Feature | -0.0000 | 1.0000 | -5.6188 | 2.9660 | 32,400 | 0
model_family_google | Model Family | 7,563 observations | 23.3% | - | - | 32,400 | 0
model_family_openai | Model Family | 11,168 observations | 34.5% | - | - | 32,400 | 0
model_family_perplexity | Model Family | 13,669 observations | 42.2% | - | - | 32,400 | 0
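
The four `_log` features above have mean 0 and standard deviation 1, which is consistent with a log transform followed by z-score standardization. The report does not state the exact transform, so the sketch below is an assumption: it uses `log1p` and the population standard deviation, and both the function name and the sample lengths are made up for illustration.

```python
import math
import statistics


def standardized_log(values):
    """Apply log1p, then z-score so the result has mean 0 and population std 1."""
    logged = [math.log1p(v) for v in values]
    mean = statistics.fmean(logged)
    std = statistics.pstdev(logged)
    return [(x - mean) / std for x in logged]


# Made-up question lengths in characters (not taken from the dataset).
lengths = [12, 40, 85, 230, 1900]
z = standardized_log(lengths)
print([round(v, 4) for v in z])
```

After this transform a coefficient on such a feature is read as the change in the outcome per one standard deviation of the logged length.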

Technical Details

Regression Method: OLS_statsmodels

PCA Precomputed: True

PCA Used: True

Total Features: 57